***Patrick Estep***

***10010 Waterview Pkwy***

***Rowlett, TX 75089***

***(214) 733-1409***

**Strengths** Creative problem solver. Experience with AI workload analysis and creation of performant AI hardware architectures. Experience with memory system modeling, analysis & architecture. Experience on Linux systems in the following: distributed computing, parallel programming, device drivers, network programming, systems programming, and performance tuning.

**Languages** C, C++, Python, shell, x86 & RISC-V assembly

**Professional Experience**

2018-Present: Micron Technology, Allen, TX

Senior Member Technical Staff

2022-Present: Software lead for Memory Lake architecture project

* Led use case exploration and analysis across a variety of application areas, including machine learning and graph analytics.
* Architecting features to improve performance and to enable multi-tiered memories.
* Drove initiative to enable CZ120 usage as a CXL fabric-attached memory.
* Provided demonstrations at Supercomputing 2023 and gave an invited talk, “Scaling for Machine Learning using CXL 3.1 and NMC.”
* Drove aggressive schedule to deliver the PNNL proof-of-concept platform, the first platform in the industry capable of running applications at scale using CXL shared memory through a CXL switch.

2018-Present: Software lead for work with government partners architecting Near Memory Compute (NMC) architectures

* Leading software effort with Pacific Northwest National Laboratory (PNNL). Architected software enabling applications to run with CXL global shared memory, including virtualization technology to allow application bring-up prior to hardware availability. Architected emulation platform that allows rapid architectural exploration, performance analysis, and application porting (5 patents filed).
* Led software effort with Office of Naval Research (ONR). Created novel machine learning networks to perform sea vessel classification. Led initiative to develop a Reinforcement Learning (RL) approach to mapping operations to a datastream processor (3 patents filed).
* Led software effort with Air Force Research Laboratory (AFRL). Accomplishments included a novel NMC architecture capable of running signal processing codes using an order of magnitude less power than state-of-the-art GPGPUs, and a second NMC architecture capable of running sparse graph analytic problems with linear scaling (11 patents filed).

2020-2022: Cache architect for Lake Havasu CXL memory module

* Created tool to allow rapid exploration of different cache architectures. This tool was the primary analysis tool used in architecting the Lake Havasu CXL cache.
* Drove architectural features into Lake Havasu that improved performance. Provided analysis and architectural options for Lake Havasu reliability and security (6 patents filed).
* Provided customer presentations and analysis; worked with cross-functional teams, providing performance emulation and simulation analysis.

2019-2020: Allen Software lead working on Micron Deep Learning Accelerator (MDLA)

* Created AI hardware lab in Allen, TX.
* Developed new testing methodologies that significantly improved MDLA reliability.
* Drove architectural changes into MDLA that improved performance (1 patent filed).
* Served as customer liaison for MDLA partner Continental AG; responsible for porting and tuning ML models to the MDLA architecture (1 patent filed).

1992-2018: DXC Technology (formerly Hewlett-Packard, formerly Convex Computer Corporation), Plano, TX

Systems Software Engineer VI, Expert

2009-2018: SeaQuest project (Linux port of HP Neoview data warehouse)

* Responsible for development of a new mechanism to export data from the SeaQuest platform to Hadoop data lake systems.
  + Achieved 30x speed improvement and improved reliability.
  + Added data conversion for native Hadoop ingestion.
* Responsible for development of a new communications infrastructure to support SeaQuest scalability and fault tolerance requirements.
  + Created kernel module to support intranode DMA and new RDMA transfer methods.
  + Reduced the number of InfiniBand network connections by 1000x.
  + Improved system performance by 10%.
  + Reduced memory usage by up to 30 GB per node.
* Worked with InfiniBand vendors to troubleshoot performance and stability issues.
* Created tools to mine problematic performance windows from collectl data.
* Created tools to identify network performance problems.

2005-2009: XC cluster group (industry-grade Linux cluster system)

* Team lead for project to enhance XC product for use by an enterprise database application.
* Triaged network performance issues across hardware, firmware and software.

2004-2005: Scalable visualization group (scalable visualization solution using Linux clusters and hardware image compositors)

* Implemented server software to acquire an incoming 3D image and copy it to a 2D window.
* Developed software functions to keep multi-tiled applications synchronized.

1995-2004: Message Passing Interface (MPI) group

* Technical lead for the HP MPI product.
* Developed the industry's first sub-1-microsecond point-to-point communications.
* Performed extensive performance optimizations of MPI-1 collective functions for NUMA (Non-Uniform Memory Access) platforms.
* Integrated low-latency interconnects with the MPI product.
* Collaborated with HP-UX team on development of RDMA protocols and NUMA programming models.
* Developed profiling interfaces to monitor messaging statistics.
* Provided direct customer support, including training, documentation, technical assistance, and technical presentations.

1994-1995: Convex OS group

* Integrated gdb with Convex OS.
* Performance work and bug fixes.

1992-1994: Convex cluster project

* Responsible for the port of PVM (Parallel Virtual Machine, a distributed programming library) to the Convex and HP architectures.
* Extended PVM to use a custom shared memory network (this was the first shared memory PVM implementation in the industry).
* Created a version of PVM for ConvexSPP (Convex's massively parallel machine).
* Other enhancements included performance improvements, functional extensions, and bug fixes.

1990 to 1992

EG&G, Inc. – Superconducting Super Collider Laboratory, Dallas, TX

Senior UNIX Systems Programmer 1991 to 1992

UNIX Systems Programmer 1990 to 1991

* Supported the development and maintenance of UNIX systems level software for Physics Computing Systems.
* Developed a data management system that managed data transfer between on-line disk storage and off-line tape storage.
* Developed a device driver for a D-2 tape drive using an IPI-3 controller on a VME bus.
* Developed a workstation allocation system to extend UNIX login functionality with load balancing features.
* Ported NQS (a distributed UNIX batch system) from Sun to SGI.
* Supported a distributed programming library (Cooperative Process Software).

1986 to 1990

Bell Helicopter, Fort Worth, TX

Computing Project Engineer 1990

Senior Computing Engineer 1988 to 1990

Computing Engineer 1986 to 1988

* Technical lead of real-time graphics group of five employees.
* Developed displays for real-time man-in-the-loop helicopter simulation.
* Real-time UNIX/C development in a distributed computing environment with a strong computer graphics emphasis.

1985 to 1986

General Dynamics, Fort Worth, TX

Software Engineer 1985 to 1986

Associate Software Engineer 1985

* Developed avionics simulations and displays for an F-16 avionics familiarization trainer.
* In addition to technical duties, was responsible for technical supervision of four employees.

August 1984 to December 1984

Southeastern Oklahoma State University, Durant, OK

Instructor

* Taught two BASIC programming classes.
* Duties included class design and instruction.

**Education**

Southern Methodist University, Dallas, TX

MS/CS, May 1992

Southeastern Oklahoma State University, Durant, OK

BS/CS & Math, December 1984

**7 patents granted, 21 patents pending, 1 innovation award**